Speaker age estimation using i-vectors

Authors

  • Mohamad Hasan Bahari
  • Mitchell McLaren
  • Hugo Van hamme
  • David A. van Leeuwen
Abstract

In this paper, a new approach for age estimation from speech signals based on i-vectors is proposed. In this method, each utterance is modeled by its corresponding i-vector. Then, Within-Class Covariance Normalization (WCCN) is used for session variability compensation. Finally, least squares support vector regression (LSSVR) is applied to estimate the age of speakers. The proposed method is trained and tested on telephone conversations from the National Institute of Standards and Technology (NIST) 2010 and 2008 speaker recognition evaluation databases. Evaluation results show that the proposed method yields a significantly lower mean absolute error and a higher Pearson correlation coefficient between chronological and estimated speaker age than several conventional schemes. The relative improvements in mean absolute error and correlation coefficient over our best baseline system are around 5% and 2%, respectively. In addition, the effects of some major factors influencing the proposed age estimation system, namely utterance length and spoken language, are analyzed.

Corresponding author. Tel: +32-(0)16-32.85.45. Fax: +32-(0)16-32.17.23.

Preprint submitted to Engineering Applications of Artificial Intelligence, May 11, 2014.
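The pipeline described in the abstract (i-vector, then WCCN, then LSSVR, scored by mean absolute error and Pearson correlation) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: random synthetic vectors stand in for real i-vectors, speaker identity is assumed as the WCCN class label, and the RBF kernel, the `gamma` and `sigma` values, the small ridge term, and all function names are hypothetical choices made for the example.

```python
import numpy as np

def within_class_cov(X, labels):
    """Average of the per-class covariance matrices of the rows of X."""
    classes = np.unique(labels)
    W = np.zeros((X.shape[1], X.shape[1]))
    for c in classes:
        D = X[labels == c] - X[labels == c].mean(axis=0)
        W += D.T @ D / len(D)
    return W / len(classes)

def wccn_projection(X, labels, ridge=1e-6):
    """Return P with P @ P.T = inv(W); rows are then mapped as X @ P."""
    # The ridge keeps W invertible; it is an assumption of this sketch.
    W = within_class_cov(X, labels) + ridge * np.eye(X.shape[1])
    return np.linalg.cholesky(np.linalg.inv(W))

def rbf_kernel(A, B, sigma):
    """Gaussian kernel matrix between the rows of A and the rows of B."""
    d2 = (A**2).sum(1)[:, None] + (B**2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.exp(-np.maximum(d2, 0.0) / (2.0 * sigma**2))

def lssvr_fit(X, y, gamma, sigma):
    """Solve the LSSVR system [[0, 1^T], [1, K + I/gamma]] [b; a] = [0; y]."""
    n = len(y)
    A = np.zeros((n + 1, n + 1))
    A[0, 1:] = 1.0
    A[1:, 0] = 1.0
    A[1:, 1:] = rbf_kernel(X, X, sigma) + np.eye(n) / gamma
    sol = np.linalg.solve(A, np.concatenate(([0.0], y)))
    return sol[0], sol[1:]  # bias, dual coefficients

def lssvr_predict(X_fit, bias, alpha, X_new, sigma):
    return rbf_kernel(X_new, X_fit, sigma) @ alpha + bias

def mean_absolute_error(y_true, y_pred):
    return float(np.mean(np.abs(y_true - y_pred)))

def pearson(y_true, y_pred):
    return float(np.corrcoef(y_true, y_pred)[0, 1])

# Synthetic stand-ins for i-vectors: 20 speakers, 5 sessions each.
rng = np.random.default_rng(0)
n_speakers, n_sessions, dim = 20, 5, 10
ages = rng.uniform(20.0, 70.0, n_speakers)
age_direction = rng.normal(size=dim)
speaker_means = rng.normal(size=(n_speakers, dim)) + np.outer((ages - 45.0) / 15.0, age_direction)
labels = np.repeat(np.arange(n_speakers), n_sessions)
X = speaker_means[labels] + 0.5 * rng.normal(size=(len(labels), dim))  # session noise
y = ages[labels]

# Split by speaker so that no test speaker is seen during training.
train = labels < 15
test = ~train

# WCCN is estimated on training data only, then applied to both sets.
P_wccn = wccn_projection(X[train], labels[train])
X_train, X_test = X[train] @ P_wccn, X[test] @ P_wccn

gamma, sigma = 10.0, 5.0  # hypothetical hyperparameters, not tuned
bias, alpha = lssvr_fit(X_train, y[train], gamma, sigma)
pred = lssvr_predict(X_train, bias, alpha, X_test, sigma)

print("MAE (years):", mean_absolute_error(y[test], pred))
print("Pearson correlation:", pearson(y[test], pred))
```

After the WCCN step the average within-speaker covariance of the training vectors is approximately the identity, which is what session variability compensation aims for here. In practice `gamma` and `sigma` would be chosen by cross-validation on held-out speakers.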

Similar resources

Age Estimation from Telephone Speech using i-vectors

Motivated by the success of i-vectors in the field of speaker recognition, this paper proposes a new approach for age estimation from telephone speech patterns based on i-vectors. In this method, each utterance is modeled by its corresponding i-vector. Then, Support Vector Regression (SVR) is applied to estimate the age of speakers. The proposed method is trained and tested on telephone conversa...

Exploring ANN back-ends for i-vector based speaker age estimation

We address the problem of speaker age estimation using i-vectors. We first compare different i-vector extraction setups and then focus on (shallow) artificial neural network (ANN) back-ends. We explore ANN architectures, training algorithms, and ANN ensembles. The results on NIST 2008 and 2010 SRE data indicate that, after extensive parameter optimization, the ANN back-end in combination with i-vectors reac...

Speaker Age Classification and Regression Using i-Vectors

In this paper, we examine the use of i-vectors for both age regression and age classification. Although i-vectors have previously been used for the age regression task, we extend this approach by fusing i-vector regression with acoustic-feature regression to estimate the speaker age. With this fusion we obtain a relative improvement of 12.6% compared to the i-vector-only system. We also use...

Robust Speaker Recognition Using MAP Estimation of Additive Noise in i-vectors Space

In the last few years, the use of i-vectors along with a generative back-end has become the new standard in speaker recognition. An i-vector is a compact representation of a speaker utterance extracted from a low-dimensional total variability subspace. Although current speaker recognition systems achieve very good results in clean training and test conditions, the performance degrades considera...

Shared latent subspace modelling within Gaussian-Binary Restricted Boltzmann Machines for NIST i-Vector Challenge 2014

This paper presents a novel approach to speaker subspace modelling based on Gaussian-Binary Restricted Boltzmann Machines (GRBM). The proposed model is based on the idea of shared factors, as in Probabilistic Linear Discriminant Analysis (PLDA). The GRBM hidden layer is divided into speaker and channel factors, where the speaker factor is shared over all vectors of the speaker. Then Maximum Lik...

Journal:
  • Eng. Appl. of AI

Volume: 34

Publication date: 2014